10. MediaPipe gesture control of the robotic arm action group

10.1. Introduction

MediaPipe is a machine-learning application development framework for data stream processing, developed and open-sourced by Google. It is a graph-based data processing pipeline for building applications that consume many forms of data, such as video, audio, sensor data, and any time-series data. MediaPipe is cross-platform and can run on embedded platforms (Raspberry Pi, etc.), mobile devices (iOS and Android), workstations, and servers, and supports mobile GPU acceleration. MediaPipe provides cross-platform, customizable ML solutions for real-time and streaming media.

10.2. Usage

Note: the [R2] button on the remote controller acts as a [pause/resume] toggle for this feature.

The demo in this section may run very slowly on the robot's main controller. It is recommended to connect the camera on the virtual machine side and run the file [02_PoseCtrlArm.launch]. The NX master controller performs better, so you can also try running it there.
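For reference, the launch file is started with roslaunch; a hypothetical invocation, where `<your_package>` is a placeholder for the package name used in your robot's workspace:

```
roslaunch <your_package> 02_PoseCtrlArm.launch
```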

After the program is running, press the controller's R2 button to start control. The camera will capture the image; six gestures are recognized, as follows.

After each gesture's action completes, the arm returns to its initial position and beeps, then waits for the next gesture to be recognized.

MediaPipe Hands infers the 3D coordinates of 21 hand landmarks from a single frame (a minimal usage sketch follows the figure below).

[Figure: hand_landmarks, the 21 hand landmark points detected by MediaPipe Hands]
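To illustrate how these landmarks can be read, here is a minimal, self-contained sketch using the MediaPipe Hands solution API. The camera index 0 and the choice to track a single hand are assumptions for the example; this is not the project's FingerCtrl.py.

```python
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands
mp_drawing = mp.solutions.drawing_utils

cap = cv2.VideoCapture(0)  # assumption: camera at index 0
with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.7) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB input; OpenCV delivers BGR
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            for hand in results.multi_hand_landmarks:
                # 21 landmarks, each with normalized x, y and a relative z depth
                tip = hand.landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP]
                print("index fingertip:", tip.x, tip.y, tip.z)
                mp_drawing.draw_landmarks(frame, hand, mp_hands.HAND_CONNECTIONS)
        cv2.imshow("hand_landmarks", frame)
        if cv2.waitKey(1) & 0xFF == 27:  # Esc to quit
            break
cap.release()
cv2.destroyAllWindows()
```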

10.4. Core files

10.4.1. mediaArm.launch
10.4.2. FingerCtrl.py

The implementation here is also very simple. The main function opens the camera to obtain frames and passes each one into the process function, which runs "detect palm" -> "obtain finger coordinates" -> "recognize gesture" in sequence, and then performs the corresponding action according to the gesture result.
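A hedged sketch of such a pipeline is shown below. The helper names are hypothetical (`count_extended_fingers` and `arm.run_action_group` are illustrative, not the actual FingerCtrl.py API), and the finger-counting rule is one common heuristic, not necessarily the one used in the project.

```python
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

# Landmark indices of the four finger tips and the PIP joints below them
TIPS = [8, 12, 16, 20]
PIPS = [6, 10, 14, 18]

def count_extended_fingers(hand_landmarks):
    """Crude gesture estimate: a finger counts as extended when its tip
    lies above its PIP joint in image coordinates (y grows downward)."""
    lm = hand_landmarks.landmark
    count = sum(lm[t].y < lm[p].y for t, p in zip(TIPS, PIPS))
    # The thumb extends sideways, so compare tip (4) with the IP joint (3)
    # relative to the MCP joint (2) along the x axis instead
    if abs(lm[4].x - lm[2].x) > abs(lm[3].x - lm[2].x):
        count += 1
    return count

def process(frame, hands, arm):
    """Detect palm -> obtain finger coordinates -> recognize gesture -> act."""
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    results = hands.process(rgb)              # detect palm
    if results.multi_hand_landmarks:
        lm = results.multi_hand_landmarks[0]  # obtain finger coordinates
        gesture = count_extended_fingers(lm)  # recognize gesture (0-5)
        arm.run_action_group(gesture)         # hypothetical arm action API
    return frame
```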

10.5. Flowchart

[Figure: flowchart of the gesture control process]